Smart Paper Notes

AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles

Jingkang Wang , Ava Pun , James Tu , Sivabalan Manivasagam , Abbas Sadat , Sergio Casas , Mengye Ren , Raquel Urtasun

Categories: Robotics | Artificial Intelligence | Computer Vision | Machine Learning

2021-01-16

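As self-driving systems become better, simulating scenarios where the autonomy stack may fail becomes more important. Traditionally, those scenarios are generated for a few scenes with respect to a planning module that takes ground-truth actor states as input. This does not scale and cannot identify all possible autonomy failures, such as perception failures due to occlusion. In this paper, we propose AdvSim, an adversarial framework to generate safety-critical scenarios for any LiDAR-based autonomy system. Given an initial traffic scenario, AdvSim modifies the actors' trajectories in a physically plausible manner and updates the LiDAR sensor data to match the perturbed world. Importantly, by simulating directly from sensor data, we obtain adversarial scenarios that are safety-critical for the full autonomy stack. Our experiments show that our approach is general and can identify thousands of semantically meaningful safety-critical scenarios for a wide range of modern self-driving systems. Furthermore, we show that the robustness and safety of these systems can be further improved by training them with scenarios generated by AdvSim.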

Translated by Google Translate

WavEnhancer: Unifying Wavelet and Transformer for Image Enhancement

Zinuo Li , Xuhang Chen , Chi-Man Pun , Shuqiang Wang

Categories: Computer Vision

2022-12-16

Image enhancement is a technique that is frequently used in digital image processing. In recent years, learning-based techniques for enhancing the aesthetic quality of photographs have grown in popularity. However, most current works do not optimize an image across different frequency domains and typically focus on either pixel-level or global-level enhancements. In this paper, we propose a transformer-based model in the wavelet domain to refine different frequency bands of an image. Our method focuses on both local details and high-level features for enhancement, which can generate superior results. On the basis of comprehensive benchmark evaluations, our method outperforms the state-of-the-art methods.
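To make "different frequency bands of an image" concrete, here is a minimal sketch of a one-level 2D wavelet decomposition. It assumes the Haar wavelet, which the abstract does not specify, and the function name `haar_dwt2` and the band names are hypothetical. It only illustrates the band split, not the paper's transformer.

```python
def haar_dwt2(image):
    """One-level orthonormal 2D Haar transform of a grayscale image.

    `image` is a list of equal-length rows with even height and width.
    Returns four half-resolution bands: a low-frequency approximation
    and three detail bands.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")

    bands = {"low": [], "col_diff": [], "row_diff": [], "diag_diff": []}
    for r in range(0, rows, 2):
        out = {name: [] for name in bands}
        for c in range(0, cols, 2):
            # The 2x2 block [[a, b], [d, e]] is mapped to four coefficients.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            out["low"].append((a + b + d + e) / 2.0)        # local average (scaled)
            out["col_diff"].append((a - b + d - e) / 2.0)   # change across columns
            out["row_diff"].append((a + b - d - e) / 2.0)   # change across rows
            out["diag_diff"].append((a - b - d + e) / 2.0)  # diagonal change
        for name in bands:
            bands[name].append(out[name])
    return bands


if __name__ == "__main__":
    flat = [[3, 3, 3, 3] for _ in range(4)]
    # A constant image has no detail: all three difference bands are zero.
    print(haar_dwt2(flat))
```

A model working in the wavelet domain could then process the low band and the detail bands separately before inverting the transform.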


ShaDocNet: Learning Spatial-Aware Tokens in Transformer for Document Shadow Removal

Xuhang Chen , Xiaodong Cun , Chi-Man Pun , Shuqiang Wang

Categories: Computer Vision

2022-11-30

Shadow removal improves the visual quality and legibility of digital copies of documents. However, document shadow removal remains an unresolved subject. Traditional techniques rely on heuristics that vary from situation to situation. Given the quality and quantity of current public datasets, the majority of neural network models are ill-equipped for this task. In this paper, we propose a Transformer-based model for document shadow removal that utilizes shadow context encoding and decoding in both shadow and shadow-free regions. Additionally, shadow detection and pixel-level enhancement are included in the whole coarse-to-fine process. On the basis of comprehensive benchmark evaluations, it is competitive with state-of-the-art methods.


Learning the shape of protein micro-environments with a holographic convolutional neural network

Michael N. Pun , Andrew Ivanov , Quinn Bellamy , Zachary Montague , Colin LaMont , Philip Bradley , Jakub Otwinowski , Armita Nourmohammad

Categories: Machine Learning

2022-11-05

Proteins play a central role in biology from immune recognition to brain activity. While major advances in machine learning have improved our ability to predict protein structure from sequence, determining protein function from structure remains a major challenge. Here, we introduce Holographic Convolutional Neural Network (H-CNN) for proteins, which is a physically motivated machine learning approach to model amino acid preferences in protein structures. H-CNN reflects physical interactions in a protein structure and recapitulates the functional information stored in evolutionary data. H-CNN accurately predicts the impact of mutations on protein function, including stability and binding of protein complexes. Our interpretable computational model for protein structure-function maps could guide design of novel proteins with desired function.


Asymmetric Scalable Cross-modal Hashing

Wenyun Li , Chi-Man Pun

Categories: Computer Vision

2022-07-26

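Cross-modal hashing is a successful method for large-scale multimedia retrieval. Many hashing methods based on matrix factorization have been proposed. However, existing methods still struggle with several problems, such as how to generate binary codes efficiently rather than directly relaxing them to continuous values. In addition, most existing methods use an $n \times n$ similarity matrix for optimization, which makes memory and computation unaffordable. In this paper, we propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) method to address these issues. It first introduces collective matrix factorization to learn a common latent space from the kernelized features of different modalities, and then transforms the similarity matrix optimization into a distance-distance difference minimization problem with the help of semantic labels and the common latent space. Hence, the computational complexity of the $n \times n$ asymmetric optimization is relieved. In generating the hash codes, we also employ an orthogonal constraint on the label information, which is indispensable for search accuracy. As a result, computational redundancy can be greatly reduced. For efficient optimization and scalability to large-scale datasets, we adopt a two-step approach rather than optimizing simultaneously. Extensive experiments on three benchmark datasets (Wiki, MIRFlickr-25K, and NUS-WIDE) demonstrate that ASCMH outperforms state-of-the-art cross-modal hashing methods in terms of accuracy and efficiency.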
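The following sketch illustrates two general points behind this abstract: why a dense $n \times n$ similarity matrix becomes unaffordable, and how retrieval with binary codes works once they exist. It is not the ASCMH algorithm. Sign thresholding stands in for whatever code-generation step the paper uses, and all function names are hypothetical.

```python
def similarity_matrix_bytes(n, bytes_per_entry=8):
    """Memory needed to store a dense n x n matrix of 8-byte floats."""
    return n * n * bytes_per_entry


def binarize(vector):
    """Turn a real-valued embedding into a {-1, +1} code by sign thresholding."""
    return [1 if v >= 0 else -1 for v in vector]


def hamming(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(code_a, code_b))


def rank_by_hamming(query_code, database_codes):
    """Database indices sorted from nearest to farthest in Hamming distance."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )


if __name__ == "__main__":
    # One million items already need terabytes for a dense similarity matrix.
    print(similarity_matrix_bytes(1_000_000) / 1e12, "TB")

    # Hypothetical embeddings, e.g. an image query against text items.
    query = binarize([0.3, -1.2, 0.0, 2.1])
    database = [binarize(v) for v in ([0.5, -0.1, 0.2, 1.0], [-0.4, 0.9, -0.3, -2.0])]
    print(rank_by_hamming(query, database))
```

Comparing short binary codes needs only bit-level operations, which is what makes hashing attractive for large-scale retrieval.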


Arbitrary Style Transfer with Structure Enhancement by Combining the Global and Local Loss

Lizhen Long , Chi-Man Pun

Categories: Computer Vision

2022-07-23

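Arbitrary style transfer generates an artistic image that combines the structure of a content image with the artistic style of an artwork, using only one trained network. The image representation used in this approach contains a content structure representation and a style pattern representation, which are usually high-level feature representations from a pre-trained classification network. However, traditional classification networks are designed for classification, so they usually focus on high-level features and ignore other features. As a result, the stylized images distribute style elements evenly throughout the image and make the overall image structure unrecognizable. To solve this problem, we introduce a novel arbitrary style transfer method with structure enhancement by combining global and local losses. The local structure details are represented by LapStyle, and the global structure is controlled by image depth. Experimental results demonstrate that, compared with other state-of-the-art methods, our method generates higher-quality images with impressive visual effects on several common datasets.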


The Third Place Solution for CVPR2022 AVA Accessibility Vision and Autonomy Challenge

Bo Yan , Leilei Cao , Zhuang Li , Hongbin Wang

Categories: Computer Vision | Artificial Intelligence

2022-06-28

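The goal of the AVA challenge is to provide vision-based benchmarks and methods relevant to accessibility. In this paper, we describe the technical details of our submission to the CVPR2022 AVA Challenge. First, we conducted experiments to help select an appropriate model and data augmentation strategy for this task. Second, we applied an effective training strategy to improve performance. Third, we integrated the results from two different segmentation frameworks to improve performance further. Experimental results demonstrate that our approach achieves a competitive result on the AVA test set. Finally, our approach achieves 63.008% AP@0.50:0.95 on the test set of the CVPR2022 AVA Challenge.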
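The reported metric, AP@0.50:0.95, follows the COCO convention of averaging average precision over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only that final averaging step plus a mask IoU helper. The function names and the per-threshold AP values are hypothetical and are not results from the paper.

```python
# IoU thresholds 0.50, 0.55, ..., 0.95, stored as integer percentages
# to avoid floating-point dictionary keys.
IOU_THRESHOLDS_PERCENT = list(range(50, 100, 5))


def mask_iou(pixels_a, pixels_b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = pixels_a | pixels_b
    if not union:
        return 0.0
    return len(pixels_a & pixels_b) / len(union)


def average_over_iou_thresholds(ap_by_threshold):
    """Mean of per-threshold AP values keyed by IoU threshold in percent."""
    return sum(ap_by_threshold[t] for t in IOU_THRESHOLDS_PERCENT) / len(
        IOU_THRESHOLDS_PERCENT
    )


if __name__ == "__main__":
    # Made-up AP values that fall as the IoU requirement gets stricter.
    ap = {t: 0.9 - 0.006 * (t - 50) for t in IOU_THRESHOLDS_PERCENT}
    print(average_over_iou_thresholds(ap))
    print(mask_iou({(0, 0), (0, 1)}, {(0, 1), (1, 1)}))
```

Because strict thresholds such as 0.95 are included, this metric rewards precise mask boundaries far more than AP at a single 0.50 threshold does.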


Image Harmonization with Region-wise Contrastive Learning

Jingtang Liang , Chi-Man Pun

Categories: Computer Vision

2022-05-27

The image harmonization task aims to harmonize different composite foreground regions according to a specific background image. Previous methods focus on improving the reconstruction ability of the generator through internal enhancements such as attention, adaptive normalization, and light adjustment. However, they pay less attention to discriminating between foreground and background appearance features within a restricted generator, which becomes a new challenge in image harmonization. In this paper, we propose a novel image harmonization framework with external style fusion and a region-wise contrastive learning scheme. For the external style fusion, we leverage the external background appearance from the encoder as the style reference to generate the harmonized foreground in the decoder. This approach enhances the harmonization ability of the decoder through external background guidance. Moreover, for the contrastive learning scheme, we design a region-wise contrastive loss function for the image harmonization task. Specifically, we first introduce a straightforward sample generation method that selects negative samples from the output harmonized foreground region and positive samples from the ground-truth background region. Our method attempts to bring together corresponding positive and negative samples by maximizing the mutual information between the foreground and background styles, which makes our harmonization network more robust in discriminating foreground and background style features when harmonizing composite images. Extensive experiments on the benchmark datasets show that our method achieves a clear improvement in harmonization quality and demonstrates good generalization capability in real-scenario applications.


Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization

Jingtang Liang , Xiaodong Cun , Chi-Man Pun , Jue Wang

Categories: Computer Vision

2021-09-13

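Image harmonization aims to modify the color of the composited region with respect to the specific background. Previous works model this task as pixel-wise image-to-image translation using UNet-family structures. However, model size and computational cost limit the ability of these models on edge devices and higher-resolution images. To this end, we propose, for the first time, a novel spatial-separated curve rendering network (S$^2$CRNet) for efficient and high-resolution image harmonization. In S$^2$CRNet, we first extract spatial-separated embeddings from the thumbnails of the masked foreground and background individually. Then, we design a curve rendering module (CRM), which learns and combines the spatial-specific knowledge using linear layers to generate the parameters of the pixel-wise curve mapping in the foreground region. Finally, we directly render the original high-resolution image using the learned color curve. In addition, we make two extensions of the proposed framework via Cascaded-CRM and Semantic-CRM, for cascaded refinement and semantic guidance, respectively. Experiments show that the proposed method reduces parameters by more than 90% compared with previous methods while still achieving state-of-the-art performance on both the synthesized iHarmony4 and the real-world DIH test sets. Moreover, our method works smoothly on higher-resolution images (e.g., $2048 \times 2048$) in 0.1 seconds with much lower GPU computational resources than all existing methods. The code will be made available at http://github.com/stefanleong/s2crnet.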
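The rendering step described above, applying a color curve pixel-wise to the foreground of the full-resolution image, can be sketched as follows. This is an illustration under assumptions: the curve is piecewise linear with evenly spaced knots, the knot values are set by hand rather than predicted by a network, and the names `apply_curve` and `render_foreground` are hypothetical. The abstract does not state the paper's actual curve parametrization.

```python
def apply_curve(value, knots):
    """Map an intensity in [0, 1] through a piecewise-linear curve.

    `knots` holds the curve outputs at evenly spaced inputs
    0, 1/(K-1), ..., 1, where K = len(knots) >= 2.
    """
    value = min(max(value, 0.0), 1.0)
    segments = len(knots) - 1
    position = value * segments
    index = min(int(position), segments - 1)  # keep value == 1.0 in the last segment
    frac = position - index
    return knots[index] * (1.0 - frac) + knots[index + 1] * frac


def render_foreground(image, mask, knots):
    """Apply the curve to masked (foreground) pixels; leave the background as is."""
    return [
        [apply_curve(p, knots) if m else p for p, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


if __name__ == "__main__":
    image = [[0.5, 0.5], [0.25, 1.0]]  # single-channel intensities in [0, 1]
    mask = [[1, 0], [0, 1]]            # 1 marks foreground pixels
    brighten = [0.0, 0.8, 1.0]         # hand-set knots standing in for predictions
    print(render_foreground(image, mask, brighten))
```

Since the curve has only a handful of parameters and is applied independently per pixel, the cost of rendering grows linearly with the number of pixels, which fits the abstract's emphasis on high-resolution inputs.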


Distributionally Robust Graph Learning from Smooth Signals under Moment Uncertainty

Xiaolu Wang , Yuen-Man Pun , Anthony Man-Cho So

Categories: Machine Learning

2021-05-12

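We consider the problem of learning a graph from a finite set of noisy graph signal observations, the goal of which is to find a smooth representation of the graph signal. Such a problem is motivated by the desire to infer relational structure in large datasets and has been extensively studied in recent years. Most existing approaches focus on learning a graph on which the observed signals are smooth. However, the learned graph is prone to overfitting, as it does not take unobserved signals into account. To address this issue, we propose a novel graph learning model based on the distributionally robust optimization methodology, which aims to identify a graph that not only provides a smooth representation of the observed signals but is also robust against uncertainties in them. On the statistics side, we establish out-of-sample performance guarantees for our proposed model. On the optimization side, we show that, under a mild assumption on the graph signal distribution, our proposed model admits a smooth non-convex optimization formulation. We then develop a projected gradient method to tackle this formulation and establish its convergence guarantees. Our formulation provides a new perspective on regularization in the graph learning setting. Moreover, extensive numerical experiments on both synthetic and real-world data show that, according to various metrics, our model performs comparably yet more robustly across different populations of observed signals than existing models.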
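"Smooth" in this setting is commonly quantified by the Laplacian quadratic form $x^\top L x = \tfrac{1}{2}\sum_{i,j} W_{ij}(x_i - x_j)^2$, which is small when strongly connected nodes carry similar values. The sketch below computes it both ways for a small weighted graph. It shows this standard measure only, not the paper's distributionally robust model, and the function names are hypothetical.

```python
def laplacian(weights):
    """Combinatorial Laplacian L = D - W of a symmetric weight matrix."""
    n = len(weights)
    return [
        [(sum(weights[i]) if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def smoothness_quadratic_form(weights, signal):
    """Compute x^T L x using the Laplacian matrix."""
    lap = laplacian(weights)
    n = len(signal)
    return sum(signal[i] * lap[i][j] * signal[j] for i in range(n) for j in range(n))


def smoothness_pairwise(weights, signal):
    """Compute 0.5 * sum_ij W_ij (x_i - x_j)^2, which equals x^T L x."""
    n = len(signal)
    return 0.5 * sum(
        weights[i][j] * (signal[i] - signal[j]) ** 2
        for i in range(n)
        for j in range(n)
    )


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 with unit edge weights.
    path = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
    print(smoothness_quadratic_form(path, [1.0, 2.0, 4.0]))
    print(smoothness_pairwise(path, [1.0, 2.0, 4.0]))
    # A constant signal is perfectly smooth on any graph.
    print(smoothness_pairwise(path, [3.0, 3.0, 3.0]))
```

Graph learning methods of the kind the abstract calls "existing approaches" typically search for weights that make this quantity small on the observed signals, which is where the overfitting concern raised in the abstract comes from.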
